- Introduction
- Materials and methods
- Results and discussion
  - 3.1 Exploratory data analysis
  - 3.2 Modeling
- Conclusion
May 10, 2021
PATIENTS.CSV: Contains information about the individuals that received the vaccines
## # A tibble: 34,121 x 35
##   VAERS_ID RECVDATE  STATE AGE_YRS CAGE_YR CAGE_MO SEX   RPT_DATE SYMPTOM_TEXT
##   <chr>    <chr>     <chr>   <dbl>   <dbl>   <dbl> <chr> <date>   <chr>
## 1 0916600  01/01/20… TX         33      33      NA F     NA       "Right side…
## 2 0916601  01/01/20… CA         73      73      NA F     NA       "Approximat…
## 3 0916602  01/01/20… WA         23      23      NA F     NA       "About 15 m…
## # … with 34,118 more rows, and 26 more variables: DIED <chr>, DATEDIED <chr>,
## #   L_THREAT <chr>, ER_VISIT <chr>, HOSPITAL <chr>, HOSPDAYS <dbl>,
## #   X_STAY <chr>, DISABLE <chr>, RECOVD <chr>, VAX_DATE <chr>,
## #   ONSET_DATE <chr>, NUMDAYS <dbl>, LAB_DATA <chr>, V_ADMINBY <chr>,
## #   V_FUNDBY <chr>, OTHER_MEDS <chr>, CUR_ILL <chr>, HISTORY <chr>,
## #   PRIOR_VAX <chr>, SPLTTYPE <chr>, FORM_VERS <dbl>, TODAYS_DATE <chr>,
## #   BIRTH_DEFECT <chr>, OFC_VISIT <chr>, ER_ED_VISIT <chr>, ALLERGIES <chr>
VACCINES.CSV: Contains information about the vaccine that was administered
## # A tibble: 34,630 x 8
##    VAERS_ID VAX_TYPE VAX_MANU          VAX_LOT VAX_DOSE_SERIES VAX_ROUTE VAX_SITE
##    <chr>    <chr>    <chr>             <chr>   <chr>           <chr>     <chr>
##  1 0916600  COVID19  "MODERNA"         037K20A 1               IM        LA
##  2 0916601  COVID19  "MODERNA"         025L20A 1               IM        RA
##  3 0916602  COVID19  "PFIZER\\BIONTE… EL1284  1               IM        LA
##  4 0916603  COVID19  "MODERNA"         unknown <NA>            <NA>      <NA>
##  5 0916604  COVID19  "MODERNA"         <NA>    1               IM        LA
##  6 0916606  COVID19  "MODERNA"         011J20A 1               IM        LA
##  7 0916607  COVID19  "MODERNA"         <NA>    <NA>            IM        LA
##  8 0916608  COVID19  "MODERNA"         <NA>    1               IM        LA
##  9 0916609  COVID19  "MODERNA"         011J20… 1               IM        LA
## 10 0916610  COVID19  "MODERNA"         <NA>    1               SYR       LA
## # … with 34,620 more rows, and 1 more variable: VAX_NAME <chr>
SYMPTOMS.CSV: Contains information about the symptoms experienced after vaccination
## # A tibble: 48,110 x 11
##   VAERS_ID SYMPTOM1     SYMPTOMVERSION1 SYMPTOM2     SYMPTOMVERSION2 SYMPTOM3
##   <chr>    <chr>                  <dbl> <chr>                  <dbl> <chr>
## 1 0916600  Dysphagia               23.1 Epiglottitis            23.1 <NA>
## 2 0916601  Anxiety                 23.1 Dyspnoea                23.1 <NA>
## 3 0916602  Chest disco…            23.1 Dysphagia               23.1 Pain in ex…
## 4 0916603  Dizziness               23.1 Fatigue                 23.1 Mobility d…
## 5 0916604  Injection s…            23.1 Injection s…            23.1 Injection …
## 6 0916606  Pharyngeal …            23.1 <NA>                    NA   <NA>
## # … with 48,104 more rows, and 5 more variables: SYMPTOMVERSION3 <dbl>,
## #   SYMPTOM4 <chr>, SYMPTOMVERSION4 <dbl>, SYMPTOM5 <chr>,
## #   SYMPTOMVERSION5 <dbl>
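Each row of SYMPTOMS.CSV holds up to five symptom terms (SYMPTOM1 to SYMPTOM5), and one report (VAERS_ID) can span several rows, which is consistent with this table having more rows than PATIENTS.CSV. A minimal sketch of collecting the terms per report and deriving a TRUE/FALSE indicator follows. It uses only the first two rows printed above; the helper name `symptoms_by_report` is hypothetical, and Python is used purely for illustration (the output shown in this deck comes from R).

```python
from collections import defaultdict

# First two rows of the printed tibble; only SYMPTOM1..SYMPTOM3 are visible there.
rows = [
    {"VAERS_ID": "0916600", "SYMPTOM1": "Dysphagia",
     "SYMPTOM2": "Epiglottitis", "SYMPTOM3": None},
    {"VAERS_ID": "0916601", "SYMPTOM1": "Anxiety",
     "SYMPTOM2": "Dyspnoea", "SYMPTOM3": None},
]


def symptoms_by_report(rows):
    """Collect the non-missing SYMPTOM1..SYMPTOM5 terms of every report."""
    result = defaultdict(set)
    for row in rows:
        for i in range(1, 6):
            term = row.get(f"SYMPTOM{i}")
            if term:  # skip missing values
                result[row["VAERS_ID"]].add(term)
    return dict(result)


by_report = symptoms_by_report(rows)

# One TRUE/FALSE indicator per report, here for dysphagia.
has_dysphagia = {vid: "Dysphagia" in terms for vid, terms in by_report.items()}
print(has_dysphagia)
```

Because the terms are stored in a set per report, a symptom that appears on several rows of the same report is counted once.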
The aim of this project is to gain insight into the adverse effects of different COVID-19 vaccines and to answer questions such as:
Do some vaccines cause more/different symptoms than others?
Do patients with some profiles get more/different symptoms?
Are certain symptoms correlated with death?
Is patient profile correlated with death?
Does taking anti-inflammatory drugs reduce the chance of having symptoms?
## # A tibble: 3 x 3
##   VAERS_ID OTHER_MEDS                    TAKES_ANTIINFLAMATORY
##   <chr>    <chr>                         <chr>
## 1 0916983  <NA>                          N
## 2 0916988  Ibuprofen PM the night before Y
## 3 0916996  Clobetasol, Benadryl          N
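The source does not say how TAKES_ANTIINFLAMATORY was derived from the free-text OTHER_MEDS field. One plausible approach is a case-insensitive keyword match, sketched below. The keyword list and the function name `takes_antiinflammatory` are assumptions made for illustration, not the author's actual rule.

```python
import re

# Hypothetical keyword list; the source does not say which drugs were matched.
ANTI_INFLAMMATORY = ("ibuprofen", "naproxen", "aspirin")
PATTERN = re.compile("|".join(ANTI_INFLAMMATORY), re.IGNORECASE)


def takes_antiinflammatory(other_meds):
    """Return "Y" if the free-text field mentions a listed drug, else "N"."""
    if other_meds is None:  # a missing OTHER_MEDS is flagged "N", as in the table
        return "N"
    return "Y" if PATTERN.search(other_meds) else "N"


# The three OTHER_MEDS values shown in the table above.
examples = [None, "Ibuprofen PM the night before", "Clobetasol, Benadryl"]
flags = [takes_antiinflammatory(m) for m in examples]
print(flags)
```

A keyword match of this kind misses brand names and misspellings unless they are added to the list, so the flag is best read as a lower bound on actual use.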
## # A tibble: 3 x 2
##   SEX       n
##   <chr> <int>
## 1 F     24070
## 2 M      8514
## 3 <NA>    828
## # A tibble: 3 x 2
##   VAX_MANU            n
##   <chr>           <int>
## 1 JANSSEN          1106
## 2 MODERNA         16253
## 3 PFIZER-BIONTECH 16053
Hypothesis: two peaks corresponding to the innate and acquired immune response
Important verbs and tools used:
Is the patient’s profile (sex, age, allergic/not, ill/not, has/had covid/not) correlated with death?
## # A tibble: 7 x 6
##   term           estimate std.error statistic  p.value odds_ratio
##   <chr>             <dbl>     <dbl>     <dbl>    <dbl>      <dbl>
## 1 (Intercept)    -9.39      0.161    -58.2    0         0.0000832
## 2 SEXM            0.929     0.0573    16.2    4.00e-59  2.53
## 3 AGE_YRS         0.0914    0.00207   44.1    0         1.10
## 4 HAS_ALLERGIESY -0.0204    0.0605    -0.338  7.35e- 1  0.980
## 5 HAS_ILLNESSY    1.08      0.0654    16.4    8.86e-61  2.93
## 6 HAS_COVIDY     -0.113     0.142     -0.794  4.27e- 1  0.893
## 7 HAD_COVIDY     -0.00375   0.195     -0.0193 9.85e- 1  0.996
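A logistic regression coefficient is a change in log-odds, so the odds_ratio column is the exponential of the estimate column. The sketch below recomputes it from the printed estimates. Small differences from the table are expected because the printed estimates are rounded. The 10-year age comparison is an added illustration, not a figure from the source.

```python
import math

# (term, estimate) pairs copied from the table; estimates are rounded as printed.
coefficients = [
    ("SEXM", 0.929),
    ("AGE_YRS", 0.0914),
    ("HAS_ALLERGIESY", -0.0204),
    ("HAS_ILLNESSY", 1.08),
    ("HAS_COVIDY", -0.113),
    ("HAD_COVIDY", -0.00375),
]

# Odds ratio = exp(log-odds coefficient).
odds_ratios = {term: math.exp(beta) for term, beta in coefficients}
for term, ratio in odds_ratios.items():
    print(f"{term:15s} {ratio:.3f}")

# For a continuous predictor, a k-unit difference multiplies the odds by exp(k * beta).
ten_year_ratio = math.exp(10 * 0.0914)
print(f"10-year age difference: {ten_year_ratio:.2f}")
```

Read this way, the AGE_YRS odds ratio of about 1.10 per year compounds to roughly 2.5 per decade under this model.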
Are some symptoms correlated with death?
## # A tibble: 20 x 6
##   term          estimate std.error statistic  p.value odds_ratio
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>      <dbl>
## 1 (Intercept)     -2.01     0.0287    -70.1  0             0.134
## 2 HEADACHETRUE    -1.67     0.156     -10.7  7.92e-27      0.188
## 3 PYREXIATRUE     -0.429    0.112      -3.82 1.34e- 4      0.651
## 4 CHILLSTRUE      -1.21     0.171      -7.11 1.17e-12      0.298
## 5 FATIGUETRUE     -0.367    0.115      -3.19 1.41e- 3      0.693
## 6 PAINTRUE        -0.913    0.153      -5.98 2.17e- 9      0.401
## 7 NAUSEATRUE      -0.621    0.139      -4.46 8.17e- 6      0.538
## 8 DIZZINESSTRUE   -2.17     0.193     -11.2  2.87e-29      0.114
## # … with 12 more rows
Does taking anti-inflammatory drugs modify the chance of having symptoms?
## # A tibble: 20 x 9
##   SYMPTOM  estimate std.error statistic p.value conf.low conf.high odds_ratio
##   <chr>       <dbl>     <dbl>     <dbl>   <dbl>    <dbl>     <dbl>      <dbl>
## 1 HEADACHE  -0.164     0.0987    -1.67   0.0958   -0.362    0.0255      0.848
## 2 PYREXIA    0.0152    0.102      0.150  0.881    -0.189    0.211       1.02
## 3 CHILLS    -0.121     0.109     -1.11   0.266    -0.340    0.0875      0.886
## 4 FATIGUE    0.0565    0.105      0.539  0.590    -0.154    0.258       1.06
## 5 PAIN       0.0113    0.110      0.102  0.919    -0.210    0.222       1.01
## # … with 15 more rows, and 1 more variable: identified_as <chr>
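In output of this shape, the statistic column is typically the estimate divided by its standard error (a Wald z statistic), and the p-value is the two-sided normal tail probability. The sketch below recomputes both for the HEADACHE row. The Wald interval it returns is an assumption: the conf.low and conf.high values in the table may have been computed by another method (profile likelihood, for example), so the bounds are close but not identical. The function name `wald_summary` is hypothetical.

```python
import math


def wald_summary(estimate, std_error, z_crit=1.96):
    """Wald z statistic, two-sided p-value and approximate 95% interval."""
    z = estimate / std_error
    # Two-sided tail of the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    low = estimate - z_crit * std_error
    high = estimate + z_crit * std_error
    return z, p_value, (low, high)


# HEADACHE row of the table: estimate -0.164, std.error 0.0987 (rounded as printed).
z, p, (low, high) = wald_summary(-0.164, 0.0987)
print(f"z = {z:.2f}, p = {p:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

An interval that contains zero on the log-odds scale corresponds to an odds ratio interval that contains 1, which is the case for all five rows shown.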
Null hypothesis: the proportion of patients that died after getting one vaccine = the proportion of patients that died after getting another vaccine
## # A tibble: 2 x 4
##   DIED  JANSSEN MODERNA `PFIZER-BIONTECH`
##   <chr>   <dbl>   <dbl>             <dbl>
## 1 N        1090   15281             15212
## 2 Y          16     972               841
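The source states the null hypothesis but not which test was applied to this table. One common choice for comparing proportions across several groups is Pearson's chi-square test, sketched below in plain Python on the counts above. The function name `chi_square` is hypothetical, and the closed-form p-value used here is valid only for 2 degrees of freedom.

```python
import math

# Rows: DIED = N, Y; columns: JANSSEN, MODERNA, PFIZER-BIONTECH (from the table).
observed = [
    [1090, 15281, 15212],
    [16, 972, 841],
]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the death share were equal in every column.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = chi_square(observed)
# With 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)
# Share of reports with DIED = Y for each manufacturer.
death_share = [y / (n + y) for n, y in zip(*observed)]
print(stat, dof, p_value, death_share)
```

For tables with other dimensions, a library routine would be the safer route to the p-value.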
Null hypothesis: the proportion of males that died after vaccination = the proportion of females that died after vaccination
## # A tibble: 2 x 3
##   DIED      F     M
##   <chr> <dbl> <dbl>
## 1 N     23271  7523
## 2 Y       799   991
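For two groups, the null hypothesis of equal proportions can be checked with a pooled two-proportion z-test. The source does not name the test it used, so the sketch below is one reasonable option rather than the author's method. The function name `two_proportion_z` is hypothetical.

```python
import math

# Counts from the table above.
died = {"F": 799, "M": 991}
survived = {"F": 23271, "M": 7523}


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (p1, p2, z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p1, p2, z, p_value


n_f = died["F"] + survived["F"]
n_m = died["M"] + survived["M"]
p_m, p_f, z, p_value = two_proportion_z(died["M"], n_m, died["F"], n_f)
print(f"M: {p_m:.4f}, F: {p_f:.4f}, z = {z:.1f}, p = {p_value:.3g}")
```

On a 2 x 2 table the square of this z statistic equals the Pearson chi-square statistic without continuity correction, so the two approaches agree.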
Dataset: https://www.kaggle.com/ayushggarg/covid19-vaccine-adverse-reactions?select=2021VAERSSYMPTOMS.csv
Dataset user guide: https://vaers.hhs.gov/docs/VAERSDataUseGuide_November2020.pdf